nd is removed from the candidate outlier list,

ܢିൌܢି

ି

(6.30)

candidate outlier (ݖ

ି) is also inserted into the tight cluster set,

ܢൌܢ∪ݖ

ି

(6.31)

൐ݐ, ݖ

ି is predicted as an outlier (perhaps a high extreme outlier)

xamination of ݖ

ି൐ݖ

ି, ∀ݖ

ି∈ܢି is terminated. If ݐ൏െݐ, ݖ

ି

ted as an outlier (perhaps a low extreme outlier) and the

ion of ݖ

ି൏ݖ

ି, ∀ݖ

ି∈ܢି is terminated.

cover DEGs when outlier genes are present — simulated data

ated toy data set was generated for the demonstration of

ng DEGs when there are outliers present. In this data set, 400

Gs and 50 up-regulated DEGs as well as 50 down-regulated DEGs

igned. The replicate number was 20. The number of outliers

om one to five. Outliers were inserted into 10% of non-DEGs.

ns that 40 non-DEGs had outliers. Outliers were also inserted into

DEGs. Therefore, ten down-regulated DEGs and ten up-regulated

ad outliers. Figure 6.18 shows three types of genes in this

d toy data. Figure 6.18(a) shows a non-DEG in which one control

and one case replicate were the outliers. In Figure 6.18(b), two

eplicates were the outliers for an up-regulated DEG. In Figure

wo case replicates were the outliers for a down-regulated DEG.

outliers appear, the traditional t test may not work well. Even the

t test may not work well. Figure 6.19 shows an experiment of

presence of outliers in the data has led to potential misleading

ng the t test. The error statistics were also included in the figure.

o to ten down-regulated DEGs were mis-classified. From six to

n-regulated DEGs were misclassified. Note that the maximum

of down-regulated DEGs was ten and the maximum number of

ated DEGs was ten as well. There was no prediction error for the

Gs when using the t test. Therefore, no Type I error. This shows